Factors associated to the duration of COVID-19 lockdowns in Chile

During the first year of the COVID-19 pandemic, several countries have implemented non-pharmacologic measures, mainly lockdowns and social distancing, to reduce the spread of the SARS-CoV-2 virus. These strategies varied widely across nations, and their efficacy is currently being studied. This study explores demographic, socioeconomic, and epidemiological factors associated with the duration of lockdowns applied in Chile between March 25th and December 25th, 2020. Joint models for longitudinal and time-to-event data were used. In this case, the number of days under lockdown for each Chilean commune and longitudinal information were modeled jointly. Our results indicate that overcrowding, number of active cases, and positivity index are significantly associated with the duration of lockdowns, being identified as risk factors for longer lockdown duration. In short, joint models for longitudinal and time-to-event data permit the identification of factors associated with the duration of lockdowns in Chile. Indeed, our findings suggest that demographic, socioeconomic, and epidemiological factors should be used to define both entering and exiting lockdown.


Study design.
Factors associated with the duration of localized lockdowns during the COVID-19 pandemic in Chile were assessed using a retrospective cohort study design 5 . In this design, researchers select a group of study units (i.e., commune under lockdown) in such a way that they have been exposed to different levels of certain predictors, and then follow them retrospectively over time to record the occurrence or not of a predefined event. In this particular investigation, the main outcome corresponds to the time elapsed until the end of the lockdown (i.e., the lockdown duration). No formal sampling of communes under lockdown was performed since all lockdowns implemented between March 25th and December 25th, 2020 were included in this study. All methods were performed in accordance with relevant guidelines and regulations.
Outcomes. In follow-up studies, two types of data can be observed: information collected over time (longitudinal data) and time until an event of interest occurs (survival data). In this study, both were included as outcomes. For the time-to-event analysis, we considered the number of days a commune was under lockdown; to do so, we recorded the number and duration of lockdowns implemented in Chile between March 25th and December 25th, 2020. If a commune was still under lockdown on December 25th, it was considered a right-censored observation. In general, lockdowns were implemented at the commune-level; however, at the beginning of the pandemic, some lockdowns were established at the greater administrative division level (i.e., province), which implies that a set of communes followed the same schedule. In this study, lockdowns were considered as commune-level.
For the longitudinal analysis, we studied several epidemiological factors collected at the first or third levels of the administrative division. The number of new asymptomatic and symptomatic cases, the number of patients in ICU, and the number of PCR exams performed were collected at the regional division level. At the commune level, we observed the number of active cases and the deaths per COVID-19 according to their residence. An active case was defined as a living person who met the definition criteria of a suspected case with a positive sample of SARS-CoV-2, whose date of onset of symptoms in the notification was less than or equal to 11 days, i.e., people capable of transmitting the infection. On the other hand, new symptomatic/asymptomatic cases correspond to new cases reported in a daily basis. All this information was considered as the number per 100,000 inhabitants. The original information used in this study was published by the Chilean Ministry of Science, Technology, Knowledge and Innovation and can be found on the GitHub repository available at https:// github. com/ MinCi encia/ Datos-COVID 19.
Based on the number of new asymptomatic and symptomatic cases and the number of PCR exams performed per region, we calculated a positivity index at the region level, which was expressed as the percentage of new cases relative to the number of PCR exams taken: As the epidemiological information varies over time, we used two strategies to include them in the model. The number of deaths was included as a weekly sum during the period the commune was under lockdown, while the others were considered as daily average within each week.
Predictors. Demographic and socioeconomic factors were considered as predictors for the time-to-event model. These consisted of population size (in scale of 100,000 inhabitants), number of immigrants (per 100,000 inhabitants), population density (number of people per km 2 ), overcrowding (number of people over the number of households), a socioeconomic development index (SDI, ranging from 0 to 1), and a rural index of the communes (ranging from 0 to 1). For calculating SDI, which is performed by the Universidad Autónoma de Chile, different indicators are aggregated, including economy (monthly per capita income and poverty), education (average years of schooling), and housing and sanitation (good and acceptable housing material and sewerage or septic tank) 6 . For the calculation of the rural index, computed by the Ministry of Social and Family Development, it is considered the percentage of the rural population, the proportion of local employment occupied positivity index = number of asymptomatic cases + number of symptomatic cases number of PCR exams . www.nature.com/scientificreports/ in primary sectors, and the population density. Then, an average of these three values was calculated, resulting in the rural index. Polanco 7 has provided details about how to calculate such measure. Besides, we considered whether the commune held the regional or province city capital and whether a commercial airport or harbor exist in it. Apart from demographic and socioeconomic factors, we also included a binary covariate indicating if it was the first or second time that the commune was under lockdown.
Statistical modeling. The two sources of information presented in this study are often analyzed separately through a survival analysis and a longitudinal analysis. However, in some situations, one may also be interested in the association between longitudinal measurements and the event of interest. In these cases, a joint approach is indicated, where information is shared between two or more models and each part provide relevant knowledge to the other. This procedure depends on the type of time-dependent covariates 8 . When this information is exogenous, i.e., variables whose cause is external to the model, an extended Cox model can be used 9 . On the other hand, when the longitudinal covariates are endogenous, i.e., variables that are changed or determined by their relationship with others, it is necessary to use a new class of models known as joint models 10 . The idea behind joint modeling of longitudinal and time-to-event data is to couple a model for repeated measurements with a survival model to explain the event of interest. The most common joint model specification is to connect a mixed-effects sub-model fitted to describe the evolution of the longitudinal information with a proportional hazard sub-model fitted to the survival information. This approach had been limited to a single longitudinal and a single time-to-event outcome for a long time. However, a model with multiple longitudinal and/or multiple time-to-event outcomes can also be considered 11,12 . Thus, a joint model for k longitudinal outcomes can be formulated as follows: where y ik (t) = (y i1 T (t), . . . , y il T (t)) represents the k-variate vector of continuous longitudinal measurement for the i th commune at time t with k = 1, . . . , l . This vector is modeled by a mixed-effects sub-model, where β k denotes the regression coefficients associated with the design vector for the fixed effects x ik (t) . Besides, z ik (t) denotes the design vector for the random b ik for the commune i . Finally, ε ik (t) represents the model error term. The longitudinal sub-model included fixed intercepts and random slopes. The joint model is completed with the time-to-event sub-model. In this case, the outcome is modeled by a proportional hazard. This kind of strategy focuses directly on the hazard function h i (t) , considering the baseline hazard function, h 0 (t) , and a second term that includes baseline covariates, w i , and the true and unobserved value of longitudinal outcome for the commune i at time t , which is denoted by m ik (t) and modeled by the longitudinal sub-model. Finally, α represents the association between the longitudinal and time-to-event outcome.
In this study, we were interested in investigating the time until a Chilean commune comes out of lockdown, which motivated the use of time-to-event sub-model. In addition, we wanted to add epidemiological information to the study; such information was obtained at the commune-level and over time during the follow-up period. Consequently, the most indicated strategy was to build a mixed longitudinal sub-model. Finally, the joint approach connected both parts, including the information obtained by the longitudinal model in the time-toevent model. Active cases, ICU patients, and deaths were logarithmically transformed for joint analysis, while the positivity index was handled as proportion.
Model building process and model validation. The model was built in two stages. First, we fitted univariate joint models for each longitudinal factor (active cases, ICU patients, deaths, and positivity index). At this point, we identified that the association between the number of deaths and the duration of lockdown was not statistically significant, i.e., the p-value was higher than 0.05. Then, a bivariate joint model was fitted, considering pairs of the significant variables. Finally, a multivariate joint model was fitted; however, the association between ICU patients and the duration of lockdown was not statistically significant. Consequently, the bivariate model, including the number of active cases and the positivity index, was considered the final version.
The second stage was aimed to select the social and demographic covariates included in the bivariate joint model. To do so, we used the stepwise backward elimination approach, starting from a full model, which included all the predictors described in the previous section. Covariates with the highest p-values were removed from the model one at a time until all predictors were below the significance threshold (p-value ≤ 0.05). Finally, the joint model included two longitudinal information, number of active cases and positivity index, and one demographic and socioeconomic factor, overcrowding. All the statistical analyses were performed with R statistical software (version 4.1.0), with the level of significance set at 5%. Package joineRML was used to fit the joint model extended to multiple continuous longitudinal measures 11 .

Results
Descriptive analysis. Between March 25th and December 25th, 2020, 147 communes were under lockdown at least once and 19 (11.4%) twice, which resulted in a sample size (N) of 166. The lockdown duration was between 7 and 172 days, being 63 the average time. Of all communes, 16 (9.6%) are regional capitals, and 26 (15.7%) are province capital cities. In addition, there are commercial airports in 9 (5.4%) of them and commercial harbor in 10 (6%). Commune's population size ranged between 1983 and 645,909 inhabitants, with a median population of 82,110 and median overcrowding of 2.73 people per household. Table 1 shows descriptive statistics for the outcome and predictors. www.nature.com/scientificreports/ According to the United Nations, in 2019, 4.92% of the Chilean population was composed of immigrants, totaling almost one million people. Particularly in this study, only four of the 147 communes had no immigrants. On the other hand, the Chilean capital was the one with greater concentration of immigrants, 22,008 per 100,000 inhabitants. The rurality index of the communes also varied greatly: while some communes were entirely urban, others included almost 80% of rural areas. Something similar happens with SDI, which varied between 26 and 99%.
Regarding epidemiological characteristics, there was a great variation across the communes. The number of active cases ranged from zero up to 2925 cases per 100,000 inhabitants, while the number of people who entered the ICU varied between 1 and 18. There was also a great variation in the number of deaths per COVID-19, reaching 1449 deaths per 100,000 inhabitants. The positivity index varied between 1 and 53% depending on the commune and the week. More details about the continuous features considered in this study are displayed in Table 1. Table 1. However, covariates with a p-value higher than the significance threshold were removed. Thus, population, population density, immigrants, SDI, rural index, number of ICU beds, and number of deaths were eliminated. The final model included overcrowding, number of active cases, and positivity index.

Joint model. The model building process started considering all predictors shown in
Results from the longitudinal sub-model showed that the number of active cases and the percentage of positivity decreased during the lockdown (Table 2). However, besides identifying this behavior, our primary goal was to know whether the number of active cases and positivity were associated with the duration of lockdowns. Thus, in the joint model, α indicates the change in the hazard for a unit change in the underlying subject-specific value of the longitudinal outcome, determining the strength of the association. Specifically, both longitudinal outcomes, number of active cases and percentage of positivity, and time-to-event outcome were significantly associated ( α 1 and α 2 , respectively).
From a survival perspective, an increase on either active cases or positivity index implies a longer lockdown. The increment of one unit on the logarithm of active cases increases 53.7% the risk of staying under lockdown, i.e., a higher number of active cases implies longer lockdown. Similarly, for every additional percent unit of www.nature.com/scientificreports/ the positivity index, the risk of the lockdown being maintained goes up by 21.4% (HR = exp{−24.120 * 0.01} ).
Regarding demographic and socioeconomic factors, the joint model showed that overcrowding was significantly associated with the lockdown duration (p-value < 0.05), Table 2. For every additional unit of overcrowding, the risk of staying in lockdown goes up by 60.5% (HR = 0.395). Joint model of longitudinal and time-to-event data is also a valuable tool to obtain predictions of survival probabilities. As detailed by Andrinopoulou et al. 12 , the predictions can be updated each time a new longitudinal measurement is available. Figure 1 shows predictions of the risk of lockdown being maintained for the commune of Santiago according to active cases and positivity observed at up to five. To do so, we set positivity as 0.25, 0.3, 0.32, 0.35, and 0.3. Similarly, active cases (in log scale) are considered as 3, 4, 4.5, 5 and 4.3.

Reproducibility material.
To facilitate the reproducibility of this study, both the data and the R code are provided in the supplementary materials, available at https:// github. com/ COVID 0248/ JM. Besides, details about the data sources are also available. Although we have not presented several details about joint models, we encourage our readers to look for more details about this modeling strategy. Asar et al. 13 and Andrinopoulou et al. 12 presented good tutorials focusing on how to use joint models in different contexts in epidemiology.

Discussion
In clinical and epidemiological research, it is common to have longitudinal and time-to-event outcomes in the same study. Then, we have realized that similar situations also occur in other scenarios, such as COVID-19 lockdowns. In this study, we proposed a joint modeling framework to investigate more about Chilean lockdowns and their related factors. More studies using the joint modeling approach may be found in the literature. For instance, Chen et al. 14 used a similar strategy for developing a dynamic risk prediction model for COVID-19 prognosis considering longitudinal measures. Lu et al. 15 used a joint approach to study the association of oxygen saturation to the fraction of inspired oxygen ratio and time to death of patients diagnosed with COVID-19. However, to the best of our knowledge, no similar studies on the duration of lockdowns are available in the literature. Hence, this study aimed to investigate how demographic, socioeconomic, and epidemiological factors are associated with the duration of lockdowns in Chile. We used a multivariate joint model to explain these associations, providing valuable and practical information to support governmental decisions regarding local lockdowns. Apart from epidemiological reports (e.g., active cases and positivity index), we found that demographic and socioeconomic information (e.g., overcrowding) was also significantly associated with the duration of local lockdowns. www.nature.com/scientificreports/ From an epidemiological point of view, it is essential to clarify that this study was not designed to determine whether the implementation of lockdowns at the commune level in Chile was an effective measure in terms of reducing contagion, but rather to explore whether the permanence of the communes under lockdown obeyed what we could call a "sanitary logic". The variables considered by the authorities to keep a commune under lockdown in Chile are numerous and heterogeneous (e.g., reduction in the number of active cases, occupation of ICU beds, basic reproductive number, ability to trace and isolate cases, among others). Our study revealed that the decision made by the authorities followed what we previously called "sanitary logic", because an increase in the number of active cases and positivity index reflects a worsening in the progress of the pandemic, which would merit keeping a commune under lockdown. Of particular interest is our finding regarding the variable overcrowding (i.e., number of people over the number of households), whose intensity also finished up as a factor associated with maintaining a lockdown. A possible explanation would be the following: the communes with the most significant overcrowding at the household level are communes in which there is a greater chance of intrahousehold contagion, thereby increasing the number of active cases and the positivity index in the commune, both being variables associated with the maintenance of lockdowns. Nevertheless, considering the complexity of the multivariable phenomenon under study, it is difficult to establish cause-effect relations, especially with design restrictions. Furthermore, factors such as mobility or traceability of cases, which we do not have information, could also affect the duration of the lockdowns. This would imply a potential bias in our results.
Finally, since many countries worldwide have established lockdowns, it would be interesting to compare our results with those obtained in other nations and thus evaluate their external validity. However, this is not a straightforward comparison once the term "lockdown" is not well defined. In fact, according to WHO, this term refers to large-scale physical distancing measures and movement restrictions, which indicates an unclear definition. Indeed, there are different adjectives used for the term, such as "total", "full", "hard", "partial", and/ or "soft" lockdown suggesting also different degrees of restrictions. In addition, when reviewing literature and international media, we have found a lack of clear and consistent strategies for both entering and exiting lockdown. In the context of easing lockdowns, Han et al. 16 analyzed the approaches taken by nine countries. They suggested that countries should base their decisions on a combination of epidemiological, social, economics, and demographic factors, which corroborates ours in spirit.

Code availability
R statistical software version 4.1.0.